Sparse LU Decomposition using FPGA
Authors
Abstract
This paper reports on an FPGA implementation of sparse LU decomposition. The resulting special-purpose hardware is geared towards power system problems (load flow computation), which are typically solved iteratively using Newton-Raphson. The key step in this process, which takes approximately 85% of the computation time, is the solution of sparse linear systems arising from the Jacobian matrices that occur in each iteration of Newton-Raphson. Current state-of-the-art software packages, such as UMFPACK and SuperLU, running on general-purpose processors perform suboptimally on these problems due to poor utilization of the floating-point hardware (typically 1 to 4% efficiency). Our LU hardware, which uses a special-purpose data path and cache designed to keep the floating-point hardware busy, achieves an efficiency of 60% and higher. This improved efficiency provides an order-of-magnitude speedup compared to a software solution using UMFPACK running on general-purpose processors.
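To make the role of the sparse solve inside Newton-Raphson concrete, here is a minimal software sketch in plain Python. It is not the paper's hardware design and not a load flow model: the toy nonlinear system, the function names (`sparse_lu_solve`, `residual`, `jacobian`, `newton_raphson`), and the row-wise elimination without pivoting are all assumptions chosen for illustration. Each iteration builds a sparse Jacobian and solves one sparse linear system, which is the step the abstract identifies as dominant.

```python
def sparse_lu_solve(rows, b):
    """Solve A x = b for A stored as one {column: value} dict per row.

    Row-wise LU elimination WITHOUT pivoting (an illustrative simplification):
    it assumes every pivot is nonzero, which holds for the toy system below.
    """
    n = len(rows)
    U = [dict(r) for r in rows]  # working copy, reduced to the U factor
    y = list(b)                  # right-hand side, reduced alongside U
    for k in range(n):
        pivot = U[k][k]
        for i in range(k + 1, n):
            if k not in U[i]:
                continue                      # structural zero: skip the row
            factor = U[i].pop(k) / pivot      # multiplier (an entry of L)
            for j, v in U[k].items():
                if j > k:
                    # Updating a missing entry creates fill-in.
                    U[i][j] = U[i].get(j, 0.0) - factor * v
            y[i] -= factor * y[k]
    # Back substitution on the upper-triangular factor.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i] - sum(v * x[j] for j, v in U[i].items() if j > i)
        x[i] = s / U[i][i]
    return x


def residual(x, b):
    """Toy system (hypothetical): F_i = 4 x_i - x_{i-1} - x_{i+1} + 0.1 x_i^3 - b_i."""
    n = len(x)
    F = []
    for i in range(n):
        s = 4.0 * x[i] + 0.1 * x[i] ** 3 - b[i]
        if i > 0:
            s -= x[i - 1]
        if i < n - 1:
            s -= x[i + 1]
        F.append(s)
    return F


def jacobian(x):
    """Sparse (tridiagonal) Jacobian of the toy residual above."""
    n = len(x)
    J = []
    for i in range(n):
        row = {i: 4.0 + 0.3 * x[i] ** 2}
        if i > 0:
            row[i - 1] = -1.0
        if i < n - 1:
            row[i + 1] = -1.0
        J.append(row)
    return J


def newton_raphson(b, tol=1e-10, max_iter=20):
    """Iterate x <- x + dx, where J(x) dx = -F(x) is solved by sparse LU."""
    x = [0.0] * len(b)
    for iteration in range(max_iter):
        F = residual(x, b)
        if max(abs(f) for f in F) < tol:
            return x, iteration
        dx = sparse_lu_solve(jacobian(x), [-f for f in F])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("Newton-Raphson did not converge")


solution, iterations = newton_raphson([1.0, 2.0, 3.0, 2.0, 1.0])
```

Production solvers such as UMFPACK and SuperLU add pivoting and fill-reducing orderings, which this sketch deliberately omits.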
Similar resources
Criticality-driven Token Dataflow Optimizations for FPGA-based Sparse LU Factorization
Performance of FPGA-based token dataflow architectures is often limited by the long tail distribution of parallelism in the compute paths of dataflow graphs. This is known to limit speedup of dataflow processing of Sparse LU factorization to only 3–10× over CPUs. In this paper, we show how to overcome these limitations by exploiting criticality information along compute paths; both statically ...
Parallel Direct Solution of Linear Equations on FPGA-Based Machines
The efficient solution of large systems of linear equations represented by sparse matrices appears in many tasks. LU factorization followed by backward and forward substitutions is widely used for this purpose. Parallel implementations of this computation-intensive process are limited primarily to supercomputers. New generations of Field-Programmable Gate Array (FPGA) technologies enable the im...
High-Performance Linear Algebra Processor using FPGA
With recent advances in FPGA (Field Programmable Gate Array) technology, it is now feasible to use these devices to build special-purpose processors for floating-point-intensive applications that arise in scientific computing. FPGAs provide programmable hardware that can be used to design custom hardware without the high cost of traditional hardware design. In this talk we discuss two multi-proc...
FPGA Based Efficient Cholesky Decomposition for Solving Least Square Problem
The paper presents an FPGA-based design and implementation of Cholesky decomposition for matrix calculation to solve the least squares problem. The Cholesky decomposition requires no pivoting, yet the factorization is stable. It also has the advantage that, instead of two distinct factor matrices, only one matrix is needed, which is multiplied by its own transpose. Hence it requires about half as many operations. The Cholesky decomposition has been designed & sim...
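As a rough software illustration of the idea in the last entry, the sketch below solves a small least squares problem through a Cholesky factorization. Forming the normal equations A^T A x = A^T b is an assumption made here for illustration; the listed abstract is truncated and does not say how that paper sets up the problem. The names `cholesky` and `least_squares` and the sample data are likewise hypothetical.

```python
import math


def cholesky(G):
    """Return lower-triangular L with G = L L^T (G symmetric positive definite)."""
    n = len(G)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(s)   # fails if G is not positive definite
            else:
                L[i][j] = s / L[j][j]
    return L


def least_squares(A, b):
    """Minimise ||A x - b||_2 via the normal equations and one Cholesky factor."""
    m, n = len(A), len(A[0])
    # Normal equations: G = A^T A, c = A^T b.
    G = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    c = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    L = cholesky(G)
    # Forward substitution: L y = c.
    y = [0.0] * n
    for i in range(n):
        y[i] = (c[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y (only the single factor L is stored).
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Fit y = a + b*t to four points that lie exactly on y = 1 + 2t.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
coeffs = least_squares(A, b)
```

Both triangular solves reuse the same factor L, which is the "one matrix instead of two" property the abstract mentions.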